{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Composite Agents with Docker Container Computer\n",
    "\n",
    "This notebook walks you through running a composed GUI agent using a Docker-based Computer and OpenRouter for the grounding model, paired with a planning model.\n",
    "\n",
    "We'll use the model string:\n",
    "\n",
    "- `\"openrouter/z-ai/glm-4.5v+openai/gpt-5-nano\"` (grounding + planning)\n",
    "\n",
    "Grounding (left) generates actionable UI coordinates; planning (right) reasons and drives steps."
   ]
  },
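  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal illustration of the `grounding+planning` convention described above, the next cell splits the model string on `+`. The helper `split_composed_model` is hypothetical and written only for this notebook; it is not part of `cua-agent` and does not necessarily mirror how the library parses the string internally."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative only: this is not the library's own parser.\n",
    "MODEL = \"openrouter/z-ai/glm-4.5v+openai/gpt-5-nano\"\n",
    "\n",
    "\n",
    "def split_composed_model(model: str) -> tuple[str, str]:\n",
    "    \"\"\"Return (grounding, planning) from a 'grounding+planning' string.\"\"\"\n",
    "    grounding, sep, planning = model.partition(\"+\")\n",
    "    if not sep or not grounding or not planning:\n",
    "        raise ValueError(f\"expected '<grounding>+<planning>', got {model!r}\")\n",
    "    return grounding, planning\n",
    "\n",
    "\n",
    "grounding_model, planning_model = split_composed_model(MODEL)\n",
    "print(\"grounding:\", grounding_model)\n",
    "print(\"planning:\", planning_model)"
   ]
  },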
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prerequisites\n",
    "\n",
    "- Docker Desktop or Engine installed and running\n",
    "- An OpenRouter account and API key (https://openrouter.ai/)\n",
    "- (Optional) An OpenAI API key if using `openai/gpt-5-nano` for planning\n",
    "- Python 3.12 environment with `cua-agent` installed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# If running outside of the monorepo:\n",
    "# %pip install \"cua-agent[all]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prepare a Docker Computer\n",
    "\n",
    "We'll follow the documented Docker provider flow (see `docs/content/docs/computer-sdk/computers.mdx`).\n",
    "\n",
    "If you don't have the image yet, either pull or build it locally. Run these in a terminal, not inside the notebook:\n",
    "\n",
    "```bash\n",
    "# Option 1: Pull from Docker Hub\n",
    "docker pull trycua/cua-ubuntu:latest\n",
    "\n",
    "# Option 2: Build locally (from repo root)\n",
    "cd libs/kasm\n",
    "docker build -t cua-ubuntu:latest .\n",
    "```"
   ]
  },
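  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To confirm from Python that the image is available before launching the container, one option is to shell out to `docker image inspect`, which exits with a non-zero status when the image is not present locally. This is a minimal sketch; the helper name `image_available` is hypothetical and not part of `cua-agent`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import shutil\n",
    "import subprocess\n",
    "\n",
    "\n",
    "def image_available(image: str) -> bool:\n",
    "    \"\"\"Return True if `docker image inspect` finds the image locally.\"\"\"\n",
    "    if shutil.which(\"docker\") is None:  # Docker CLI is not on PATH\n",
    "        return False\n",
    "    result = subprocess.run(\n",
    "        [\"docker\", \"image\", \"inspect\", image],\n",
    "        capture_output=True,\n",
    "    )\n",
    "    return result.returncode == 0\n",
    "\n",
    "\n",
    "print(\"image present:\", image_available(\"trycua/cua-ubuntu:latest\"))"
   ]
  },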
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set environment keys\n",
    "\n",
    "- Get an OpenRouter API key at https://openrouter.ai/\n",
    "- If using OpenAI for planning, set your OpenAI key as well\n",
    "- You can input them here to set for this notebook session"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "OPENROUTER_API_KEY = (\n",
    "    os.getenv(\"OPENROUTER_API_KEY\") or input(\"Enter your OPENROUTER_API_KEY: \").strip()\n",
    ")\n",
    "os.environ[\"OPENROUTER_API_KEY\"] = OPENROUTER_API_KEY\n",
    "\n",
    "# Optional: if planning model uses OpenAI provider\n",
    "OPENAI_API_KEY = (\n",
    "    os.getenv(\"OPENAI_API_KEY\")\n",
    "    or input(\"(Optional) Enter your OPENAI_API_KEY (press Enter to skip): \").strip()\n",
    ")\n",
    "if OPENAI_API_KEY:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
   ]
  },
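  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before creating the agent, it can help to confirm that the keys required by the model string are actually set. The sketch below assumes that the provider is the first path segment of each half of the model string, and that the `openrouter` and `openai` providers read `OPENROUTER_API_KEY` and `OPENAI_API_KEY` respectively, as in the cell above. The `PROVIDER_ENV` mapping and the `missing_keys` helper are hypothetical and not part of `cua-agent`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# Assumed mapping from provider prefix to environment variable (illustrative)\n",
    "PROVIDER_ENV = {\n",
    "    \"openrouter\": \"OPENROUTER_API_KEY\",\n",
    "    \"openai\": \"OPENAI_API_KEY\",\n",
    "}\n",
    "\n",
    "\n",
    "def missing_keys(model: str, env=os.environ) -> list[str]:\n",
    "    \"\"\"List the env vars needed by a 'grounding+planning' string that are unset.\"\"\"\n",
    "    missing = []\n",
    "    for part in model.split(\"+\"):\n",
    "        provider = part.split(\"/\", 1)[0]\n",
    "        var = PROVIDER_ENV.get(provider)\n",
    "        if var and not env.get(var) and var not in missing:\n",
    "            missing.append(var)\n",
    "    return missing\n",
    "\n",
    "\n",
    "print(missing_keys(\"openrouter/z-ai/glm-4.5v+openai/gpt-5-nano\") or \"All required keys are set.\")"
   ]
  },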
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a Docker Computer and a composed agent\n",
    "\n",
    "This uses the documented Docker provider parameters: `os_type=\"linux\"`, `provider_type=\"docker\"`, plus `image` and `name`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import asyncio\n",
    "from computer import Computer\n",
    "from agent import ComputerAgent\n",
    "\n",
    "\n",
    "async def main():\n",
    "    # Launch & connect to a Docker container running the Computer Server\n",
    "    async with Computer(\n",
    "        os_type=\"linux\",\n",
    "        provider_type=\"docker\",\n",
    "        image=\"trycua/cua-ubuntu:latest\",\n",
    "        name=\"my-cua-container\",\n",
    "    ) as computer:\n",
    "        agent = ComputerAgent(\n",
    "            model=\"openrouter/z-ai/glm-4.5v+openai/gpt-5-nano\",\n",
    "            tools=[computer],\n",
    "            trajectory_dir=\"trajectories\",  # Save the agent trajectory (screenshots, API calls)\n",
    "        )\n",
    "\n",
    "        # Simple task to verify end-to-end\n",
    "        async for _ in agent.run(\"Open a browser and go to example.com\"):\n",
    "            pass\n",
    "\n",
    "\n",
    "asyncio.run(main())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Notes\n",
    "\n",
    "- Grounding (OpenRouter `z-ai/glm-4.5v`) + Planning (OpenAI `gpt-5-nano`) can be swapped for other providers/models.\n",
    "- If you prefer to avoid OpenAI, choose a planning model on OpenRouter and update the model string accordingly.\n",
    "- Be sure the planning model supports `vision` input and the `tools` parameter.\n",
    "- The agent emits normalized Agent Responses across providers."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
